131 research outputs found

    Structural Analysis: Shape Information via Points-To Computation

    Full text link
    This paper introduces a new hybrid memory analysis, Structural Analysis, which combines an expressive shape analysis style abstract domain with efficient and simple points-to style transfer functions. Using data from empirical studies on the runtime heap structures and the programmatic idioms used in modern object-oriented languages we construct a heap analysis with the following characteristics: (1) it can express a rich set of structural, shape, and sharing properties which are not provided by a classic points-to analysis and that are useful for optimization and error detection applications (2) it uses efficient, weakly-updating, set-based transfer functions which enable the analysis to be more robust and scalable than a shape analysis and (3) it can be used as the basis for a scalable interprocedural analysis that produces precise results in practice. The analysis has been implemented for .Net bytecode and using this implementation we evaluate both the runtime cost and the precision of the results on a number of well known benchmarks and real world programs. Our experimental evaluations show that the domain defined in this paper is capable of precisely expressing the majority of the connectivity, shape, and sharing properties that occur in practice and, despite the use of weak updates, the static analysis is able to precisely approximate the ideal results. The analysis is capable of analyzing large real-world programs (over 30K bytecodes) in less than 65 seconds and using less than 130MB of memory. In summary this work presents a new type of memory analysis that advances the state of the art with respect to expressive power, precision, and scalability and represents a new area of study on the relationships between and combination of concepts from shape and points-to analyses

    Principal arc analysis on direct product manifolds

    Get PDF
    We propose a new approach to analyze data that naturally lie on manifolds. We focus on a special class of manifolds, called direct product manifolds, whose intrinsic dimension could be very high. Our method finds a low-dimensional representation of the manifold that can be used to find and visualize the principal modes of variation of the data, as Principal Component Analysis (PCA) does in linear spaces. The proposed method improves upon earlier manifold extensions of PCA by more concisely capturing important nonlinear modes. For the special case of data on a sphere, variation following nongeodesic arcs is captured in a single mode, compared to the two modes needed by previous methods. Several computational and statistical challenges are resolved. The development on spheres forms the basis of principal arc analysis on more complicated manifolds. The benefits of the method are illustrated by a data example using medial representations in image analysis.Comment: Published in at http://dx.doi.org/10.1214/10-AOAS370 the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org

    Efficient Context-Sensitive Shape Analysis with Graph Based Heap Models

    Full text link
    The performance of heap analysis techniques has a significant impact on their utility in an optimizing compiler.Most shape analysis techniques perform interprocedural dataflow analysis in a context-sensitive manner, which can result in analyzing each procedure body many times (causing significant increases in runtime even if the analysis results are memoized). To improve the effectiveness of memoization (and thus speed up the analysis) project/extend operations are used to remove portions of the heap model that cannot be affected by the called procedure (effectively reducing the number of different contexts that a procedure needs to be analyzed with). This paper introduces project/extend operations that are capable of accurately modeling properties that are important when analyzing non-trivial programs (sharing, nullity information, destructive recursive functions, and composite data structures). The techniques we introduce are able to handle these features while significantly improving the effectiveness of memoizing analysis results (and thus improving analysis performance). Using a range of well known benchmarks (many of which have not been successfully analyzed using other existing shape analysis methods) we demonstrate that our approach results in significant improvements in both accuracy and efficiency over a baseline analysis

    A static heap analysis for shape and connectivity: Unified memory analysis: The base framework

    Get PDF
    Modeling the evolution of the state of program memory during program execution is critical to many parallehzation techniques. Current memory analysis techniques either provide very accurate information but run prohibitively slowly or produce very conservative results. An approach based on abstract interpretation is presented for analyzing programs at compile time, which can accurately determine many important program properties such as aliasing, logical data structures and shape. These properties are known to be critical for transforming a single threaded program into a versión that can be run on múltiple execution units in parallel. The analysis is shown to be of polynomial complexity in the size of the memory heap. Experimental results for benchmarks in the Jolden suite are given. These results show that in practice the analysis method is efflcient and is capable of accurately determining shape information in programs that créate and manipúlate complex data structures

    Visualizing genetic constraints

    Get PDF
    Principal Components Analysis (PCA) is a common way to study the sources of variation in a high-dimensional data set. Typically, the leading principal components are used to understand the variation in the data or to reduce the dimension of the data for subsequent analysis. The remaining principal components are ignored since they explain little of the variation in the data. However, evolutionary biologists gain important insights from these low variation directions. Specifically, they are interested in directions of low genetic variability that are biologically interpretable. These directions are called genetic constraints and indicate directions in which a trait cannot evolve through selection. Here, we propose studying the subspace spanned by low variance principal components by determining vectors in this subspace that are simplest. Our method and accompanying graphical displays enhance the biologist's ability to visualize the subspace and identify interpretable directions of low genetic variability that align with simple directions.Comment: Published in at http://dx.doi.org/10.1214/12-AOAS603 the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org

    Program Synthesis using Natural Language

    Get PDF
    Interacting with computers is a ubiquitous activity for millions of people. Repetitive or specialized tasks often require creation of small, often one-off, programs. End-users struggle with learning and using the myriad of domain-specific languages (DSLs) to effectively accomplish these tasks. We present a general framework for constructing program synthesizers that take natural language (NL) inputs and produce expressions in a target DSL. The framework takes as input a DSL definition and training data consisting of NL/DSL pairs. From these it constructs a synthesizer by learning optimal weights and classifiers (using NLP features) that rank the outputs of a keyword-programming based translation. We applied our framework to three domains: repetitive text editing, an intelligent tutoring system, and flight information queries. On 1200+ English descriptions, the respective synthesizers rank the desired program as the top-1 and top-3 for 80% and 90% descriptions respectively
    • …
    corecore